Contextual distinctiveness
Definition
Contextual distinctiveness refers to how many different psychological, lexical, or semantic contexts a word typically appears in. Words that are more contextually distinct—appearing in fewer, more specific contexts—are often considered more difficult or advanced (Berger et al., 2017).
Methodology
Several approaches are used to capture contextual distinctiveness:
- EAT: Variety of responses to a word in free association (Kiss et al., 1973)
- USF: Number of different stimuli that elicit a word in free association (Nelson, McEvoy, & Schreiber, 1998)
- McD: Probability of co-occurrence with frequent words in a narrow window (McDonald & Shillcock, 2001)
- Sem_D: Semantic variability across contextual environments in large corpora (Hoffman, Ralph, & Rogers, 2013)
- LSA: Hidden (latent) semantic structures beyond keywords (Laundauer et al., 2014)
Corpus used
- EAT (Edinburgh Associative Thesaurus), written
- USF Free Association Norms, written
- BNC (British National Corpus), written & spoken
- TASA (Touchstone Applied Sciences Associates), written
Register
- Written (EAT, USF, BNC, TASA)
- Spoken (BNC: Co-occurrence)
Calculated indices
Free association response types (EAT)
- Indices:
- EAT_types_AW
- EAT_types_CW
- EAT_types_FW
Free association stimuli elicited (USF)
- Indices:
- USF_AW
- USF_CW
- USF_FW
Co-occurrence probability (McD)
- Methodology: Kullback-Leibler divergence (relative entropy)
- Indices:
- McD_CD_AW
- McD_CD_CW
- McD_CD_FW
Semantic distinctiveness (Sem_D)
- Methodology: Variability of contexts in which a word appears across 1,000-word text sections; Reversed natural log of mean LSA cosine similarity across chunks
- Indices:
- Sem_D_AW
- Sem_D_CW
- Sem_D_FW
Latent semantic analysis (LSA)
- Indices:
- lsa_average_top_three_cosine
- lsa_max_similarity_cosine
- lsa_average_all_cosine
References
- Berger, C., Crossley, S. & Kyle, K. (2017). Using Native-Speaker Psycholinguistic Norms to Predict Lexical Proficiency and Development in Second-Language Production. Applied Linguistics, 40(1), 22–42. https://doi.org/10.1093/applin/amx005
- BNC Consortium. (2007). British national corpus. Oxford Text Archive Core Collection.
- Hoffman, P., Lambon Ralph, M.A. & Rogers, T.T. (2013). Semantic diversity: A measure of semantic ambiguity based on variability in the contextual usage of words. Behav Res 45, 718–730. https://doi.org/10.3758/s13428-012-0278-x
- Kiss, G. R. (1973). Grammatical word classes: A learning process and its simulation. In Psychology of learning and motivation (Vol. 7, pp. 1-41). Academic Press. https://doi.org/10.1016/S0079-7421(08)60064-X
- Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to latent semantic analysis. Discourse processes, 25(2-3), 259-284. https://doi.org/10.1080/01638539809545028
- McDonald, S. A., & Shillcock, R. C. (2001). Rethinking the Word Frequency Effect: The Neglected Role of Distributional Information in Lexical Processing. Language and Speech, 44(3), 295-322. https://doi.org/10.1177/00238309010440030101
- Nelson, D. L., McEvoy, C. L., & Schreiber, T. A. (2004). The University of South Florida free association, rhyme, and word fragment norms. Behavior Research Methods, Instruments, & Computers, 36(3), 402-407. https://doi.org/10.3758/BF03195588